Capacity analysis is not about how much information
you can collect; it is about collecting the appropriate system health
indicators and the right amount of information. Without a doubt, you can
capture and monitor an overwhelming amount of information from
performance counters. There are more than 1,000 counters, so you'll want
to choose carefully what to monitor. Otherwise, you might collect so
much information that the data becomes hard to manage and difficult to
decipher. Keep in mind that more is not necessarily better with regard
to capacity analysis; this process is about efficiency. Therefore,
you need to tailor your capacity-analysis monitoring as closely as
possible to how the server is configured.
Every Windows Server 2008 R2
server has a common set of resources that can affect performance,
reliability, stability, and availability. For this reason, it’s
important that you monitor this common set of resources, namely CPU,
memory, disk, and network utilization.
In addition to the common
set of resources, the functions that a Windows Server 2008 R2 server
performs can influence what you should consider monitoring. For
example, you would monitor certain aspects of system performance on a
file server differently than you would on a domain controller running
Windows Server 2008 R2 Active Directory Domain Services (AD DS). There
are many functional roles (such as file and print sharing, application
sharing, database functions, web server duties, domain controller
roles, and more) that Windows Server 2008 R2 can perform, and it is
important to understand which of those roles pertain to each server
system. By identifying these functions and monitoring them along with
the common set of resources, you gain much greater control over and
understanding of the system.
The following sections go into more
depth on the specific items you should monitor for the different
components that constitute the common set of resources. It's important
to realize, though, that several other items beyond the ones described
in this article deserve monitoring consideration. Treat the following
material as a baseline: the minimum set of items with which to begin
your capacity-analysis and performance-optimization procedures.
Key Elements to Monitor for Bottlenecks
As mentioned, four resources
compose the common set of resources: memory and pagefile usage,
processor, disk subsystem, and network subsystem. They are also the most
common contributors to performance bottlenecks. A bottleneck can be
defined in two ways. The most common perception of a bottleneck is that
it is the slowest part of your system, which can be either hardware or
software (although, generally speaking, hardware is faster than
software). When a resource is overburdened or simply not equipped to
handle higher workloads, the system might experience a slowdown in
performance. For any system, the slowest component is, by definition,
the bottleneck. For example, a web server might be equipped with ample
RAM, disk space, and a high-speed network interface card (NIC), but if
the disk subsystem contains older, relatively slow drives, the web
server might not be able to handle requests effectively. The bottleneck
(that is, the antiquated disk subsystem) drags the other resources down
with it.
A less common, but
equally important, form of bottleneck is one in which a system has
significantly more RAM, processors, or other resources than the
application requires. In these cases, the system creates extremely large
pagefiles and has to manage very large disk or memory sets, yet it
never uses those resources. When an application needs to access memory,
processors, or disks, the system might be busy managing the idle
resources, creating an unnecessary bottleneck caused by having too
many resources allocated to the system. Thus, performance optimization
means allocating neither too few nor too many resources to a system.
Monitoring System Memory and Pagefile Usage
Available system memory is
usually the most common source of performance problems on a system. The
reason is simply that systems are often configured with an incorrect
amount of memory, and Windows Server 2008 R2 tends to consume a lot of
it. Fortunately, the easiest and most economical way to resolve this
kind of performance issue is to configure the system with additional
memory, which can significantly boost performance and improve
reliability.
There are many significant
counters in the Memory object that can help determine system memory
requirements. Most network environments shouldn't need to consistently
monitor every single counter to get an accurate representation of
performance. For long-term monitoring, two very important counters can
give you a fairly accurate picture of memory pressure: Page Faults/sec
and Pages/sec. These two memory counters alone can indicate whether the
system is properly configured and whether it is experiencing memory
pressure. Table 1 outlines the counters necessary to monitor memory and pagefile usage, along with a description of each.
Table 1. Important Counters and Descriptions Related to Memory Behavior
Object | Counter | Description
---|---|---
Memory | Committed Bytes | Monitors how much memory (in bytes) has been allocated by processes. As this number grows beyond available RAM, so does the size of the pagefile, because paging increases.
Memory | Pages/sec | Displays the number of pages read from or written to disk per second.
Memory | Pages Output/sec | Displays virtual memory pages written to the pagefile per second. Monitor this counter to identify paging as a bottleneck.
Memory | Page Faults/sec | Reports both soft and hard faults per second.
Process | Working Set, _Total | Displays the amount of virtual memory that is actually in use.
Paging File | % Usage | Reports the percentage of the paging file that is actually in use. Use this counter to determine whether the Windows pagefile is a potential bottleneck; if it remains consistently above 50% to 75%, consider increasing the pagefile size or moving the pagefile to a different disk.
By default, the Memory tab in Resource Monitor, shown in Figure 1,
provides a good high-level view of current memory activity. For more
advanced monitoring of memory and pagefile activity, use the Performance
Monitor snap-in.
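You can also sample these counters from a script rather than the UI. The following is a minimal Python sketch that shells out to the built-in typeperf utility; the counter paths come from Table 1, while the one-second interval and ten-sample window are arbitrary choices for illustration.

```python
import csv
import subprocess

# Counter paths from Table 1; typeperf ships with Windows and writes CSV.
COUNTERS = [r"\Memory\Pages/sec",
            r"\Memory\Page Faults/sec",
            r"\Paging File(_Total)\% Usage"]

# -si = sample interval in seconds, -sc = number of samples.
out = subprocess.run(["typeperf", *COUNTERS, "-si", "1", "-sc", "10"],
                     capture_output=True, text=True, check=True)

rows = list(csv.reader(out.stdout.strip().splitlines()))
header = rows[0]
samples = [r for r in rows[1:] if len(r) == len(header)]  # skip status lines
for i, name in enumerate(header[1:], start=1):
    values = [float(r[i]) for r in samples if r[i]]
    if values:
        print(f"{name}: average {sum(values) / len(values):.1f}")
```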
Systems experience page
faults when a process requires code or data that it can't find in its
working set. A working set is the amount of memory that is committed to a
particular process. When a page fault occurs, the process has to
retrieve the code or data from another part of physical memory (referred
to as a soft fault) or, in the worst case, from the disk subsystem (a
hard fault). Systems today can handle a large number of soft faults
without significant performance hits. However, because hard faults
require disk subsystem access, they can force the process to wait,
which can drag performance to a crawl. Memory access is orders of
magnitude faster than disk subsystem access, even with the fastest hard
drives available. The Memory section of Resource Monitor includes
columns that display working sets and hard faults by default.
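To inspect working sets and page-fault counts per process outside of Resource Monitor, the third-party psutil package exposes both on Windows. This is a rough sketch rather than a definitive tool: on Windows, psutil's memory_info() result includes a num_page_faults field, and rss corresponds to the working set. Note that num_page_faults is a running total of soft and hard faults combined, so to approximate a rate you would sample it twice and take the difference.

```python
import psutil  # third-party: pip install psutil

# List the ten largest working sets along with each process's cumulative
# page-fault count. On Windows, memory_info().rss is the working set and
# num_page_faults counts both soft and hard faults since process start.
procs = []
for p in psutil.process_iter(["name", "memory_info"]):
    mi = p.info["memory_info"]
    if mi is not None:  # None when access to the process is denied
        procs.append((mi.rss, mi.num_page_faults, p.info["name"]))

for rss, faults, name in sorted(procs, reverse=True)[:10]:
    print(f"{name:<32} working set {rss // 2**20:>6} MB  "
          f"page faults {faults:>12,}")
```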
The Page Faults/sec
counter reports both soft and hard faults. It’s not uncommon to see this
counter displaying rather large numbers. Depending on the workload
placed on the system, this counter can display several hundred faults
per second. When it gets beyond several hundred page faults per second
for long durations, you should begin checking other memory counters to
identify whether a bottleneck exists.
Probably the most
important memory counter is Pages/sec. It reveals the number of pages
read from or written to disk and is, therefore, a direct representation
of the number of hard page faults the system is experiencing. Microsoft
recommends upgrading the amount of memory in systems that are seeing
Pages/sec values consistently averaging above 5 pages per second. In
actuality, you’ll begin noticing slower performance when this value is
consistently higher than 20. So, it’s important to carefully watch this
counter as it nudges higher than 10 pages per second.
Note
The Pages/sec
counter is also particularly useful in determining whether a system is
thrashing. Thrashing is a term used to describe systems experiencing
more than 100 pages per second. Thrashing should never be allowed to
occur on Windows Server 2008 R2 systems because the reliance on the disk
subsystem to resolve memory faults greatly affects how efficiently the
system can sustain workloads.
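As a rough way to put these thresholds into practice, the following sketch samples Pages/sec for about a minute via typeperf and classifies the average against the 5, 20, and 100 pages-per-second guidelines above; the one-minute sampling window is an arbitrary illustrative choice.

```python
import csv
import subprocess

# Thresholds from the guidelines above, checked from highest to lowest.
THRESHOLDS = [(100, "thrashing"), (20, "noticeably slow"),
              (5, "consider adding RAM"), (0, "healthy")]

# Twelve samples at five-second intervals: roughly a one-minute window.
out = subprocess.run(
    ["typeperf", r"\Memory\Pages/sec", "-si", "5", "-sc", "12"],
    capture_output=True, text=True, check=True)

rows = list(csv.reader(out.stdout.strip().splitlines()))
values = [float(r[1]) for r in rows[1:] if len(r) == 2 and r[1]]
if values:
    average = sum(values) / len(values)
    label = next(text for limit, text in THRESHOLDS if average >= limit)
    print(f"Pages/sec averaged {average:.1f} over the window: {label}")
```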
System memory (RAM) is limited
in size and Windows supplements the use of RAM with virtual memory,
which is not as limited. Windows will begin paging to disk when all RAM
is being consumed, which, in turn, frees RAM for new applications and
processes. Virtual memory resides in the pagefile.sys file, which is
usually located in the root of the system drive. Each disk can contain a
pagefile. The location and size of the pagefile is configured under the
Virtual Memory section, shown in Figure 2.
To access the Performance Options window, do the following:
1. Click Start.
2. Right-click Computer and select Properties.
3. Click the Advanced System Settings link on the left.
4. When the System Properties window opens, click the Settings button under the Performance section.
5. Select the Advanced tab.
6. Click Change under Virtual Memory.
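If you want to check the configured pagefiles from a script instead of the dialog, they are recorded in the registry under the Memory Management key. Here is a minimal read-only sketch using Python's standard winreg module; the example value shown in the comment is hypothetical.

```python
import winreg  # standard library on Windows

# Configured pagefiles are stored as a REG_MULTI_SZ of
# "path minimum maximum" entries (sizes in MB; "0 0" or a bare path
# typically indicates a system-managed pagefile).
KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as key:
    paging_files, _ = winreg.QueryValueEx(key, "PagingFiles")

for entry in paging_files:
    print(entry)  # e.g. "C:\pagefile.sys 4096 8192" (hypothetical values)
```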
Tip
Windows
normally handles and increases the size of pagefile.sys automatically
as needed; however, in some cases you might want to manage virtual
memory settings yourself to gain additional performance. Keeping the
default pagefile on the system drive and adding a second pagefile to
another hard disk can significantly improve performance. Spanning
virtual memory across multiple disks, or simply placing pagefile.sys on
another, less-used disk, also allows Windows to run faster. Just ensure
that the other disk isn't slower than the disk pagefile.sys currently
resides on. The more physical memory a system has, the more virtual
memory will be allocated.